Background - Hardware Acceleration to Offscreen GWorlds (ATI)
Overview
This sample is based on the "Background" QuickDraw3D sample app written by Nick Thompson at Apple. In fact, very little code has changed between the two versions. It is still a demo of how to use a pixmap draw context to draw 3D data over a background, i.e. both versions show how to use QuickDraw3D to render into offscreen GWorlds. However, in the current (ATI) version we build the GWorlds so that they reside in VRAM on the hardware accelerator card. This allows ATI Rage II or Rage Pro chips to accelerate rendering to these GWorlds. Previously, this was only possible with the Apple accelerator card ("White Magic").
History
The history of this issue is roughly as follows: many applications need to do QuickDraw3D rendering to offscreen GWorlds in order to composite with background images, etc. Naturally, the memory buffers for the PixMaps of these GWorlds are allocated in system memory. The ATI Rage II and Rage Pro accelerator cards can only accelerate QuickDraw3D/RAVE rendering to buffers that reside on VRAM. The unfortunate result is that apps that use GWorlds do not get a speed boost from hardware acceleration. The solution to this problem is to allow GWorlds to be created so that their PixMap memory resides in VRAM.
The only question is how to provide this new capability through the NewGWorld() routine. One solution is for Apple to add a flag to NewGWorld() that would specify that the application wants a GWorld allocated in VRAM. For this to work with various hardware vendors' cards, Apple would need to design a generic API for allocating memory on accelerator cards. This has been considered, but it will likely be a while before it materializes.
ATI Solution
The modified "Background (ATI)" sample app solves the problem of QuickDraw3D acceleration to GWorlds for all ATI accelerator cards (XClaimVR and XClaim3D w/ Rage2; XClaimVR and XClaim3D w/ Rage Pro; NexusGA). Here's how it works: ATI wrote a small shared library, the "Offscreen Mem Manager", to allow apps to allocate and free blocks of VRAM on ATI cards. "Background (ATI)" calls the standard NewGWorld() toolbox routine to allocate a GWorld in system memory. Then it releases the pixel memory of the GWorld's PixMap from system memory, and calls ATIMem_AllocVRAM() to allocate a block of VRAM. The pointer to this VRAM memory is stored back in the GWorld PixMap's baseAddr. So, now we've created a GWorld that has its PixMap in VRAM.
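The buffer swap described above can be sketched in portable C. This is only an illustration and not the sample's actual code: SketchPixMap, sketch_new_gworld(), sketch_alloc_vram() and sketch_move_to_vram() are hypothetical stand-ins, the "VRAM" is an ordinary static array, and the real signature of ATIMem_AllocVRAM() is not reproduced here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a GWorld's PixMap: only the fields this sketch needs. */
typedef struct {
    unsigned char *baseAddr;  /* start of the pixel memory */
    long rowBytes;            /* bytes per row of pixels */
    int width, height;        /* size in pixels */
    int inVRAM;               /* bookkeeping for this sketch only */
} SketchPixMap;

/* Simulated card memory (2 MB) with a simple bump allocator. */
static unsigned char gFakeVRAM[2 << 20];
static size_t gFakeVRAMUsed = 0;

/* Hypothetical stand-in for ATIMem_AllocVRAM(); returns NULL when the pool is full. */
static unsigned char *sketch_alloc_vram(size_t bytes)
{
    unsigned char *p;
    if (bytes > sizeof gFakeVRAM - gFakeVRAMUsed) return NULL;
    p = gFakeVRAM + gFakeVRAMUsed;
    gFakeVRAMUsed += bytes;
    return p;
}

/* Plays the role of NewGWorld(): the pixel buffer starts out in system memory. */
static int sketch_new_gworld(SketchPixMap *pm, int width, int height, int bytesPerPixel)
{
    pm->width = width;
    pm->height = height;
    pm->rowBytes = (long)width * bytesPerPixel;
    pm->inVRAM = 0;
    pm->baseAddr = (unsigned char *)malloc((size_t)pm->rowBytes * (size_t)height);
    return pm->baseAddr != NULL;
}

/* Release the system-memory pixels and repoint baseAddr at a VRAM block.
   Returns 1 on success, 0 if no VRAM was available (the GWorld is left untouched). */
static int sketch_move_to_vram(SketchPixMap *pm)
{
    size_t bytes;
    unsigned char *vram;
    assert(pm != NULL && !pm->inVRAM);
    bytes = (size_t)pm->rowBytes * (size_t)pm->height;
    vram = sketch_alloc_vram(bytes);
    if (vram == NULL) return 0;
    free(pm->baseAddr);       /* drop the system-memory buffer */
    pm->baseAddr = vram;      /* store the VRAM pointer back in baseAddr */
    pm->inVRAM = 1;
    return 1;
}
```

Note that this sketch allocates the VRAM block before freeing the system buffer, so a failed allocation leaves a usable system-memory GWorld behind. That ordering is a choice made for the sketch; the description above frees first and allocates second.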
Acceleration
Once a GWorld has been modified to reside in VRAM, the hardware will accelerate several crucial rendering operations. The "Background" app demonstrates a typical four-step render loop:
1) CopyBits() offscreen image to offscreen "background" buffer
2) render QD3D scene to offscreen "render" buffer
3) CopyBits() contents of "render" buffer onto "background" buffer
4) CopyBits() new contents of "background" buffer to window on screen
Each of these steps will be accelerated by either the ATI 2D or 3D accelerator. In steps 1, 3, and 4, the 2D accelerator, "ATI Graphics Accelerator", will recognize that the buffers it is being asked to copy are both in VRAM and will issue a hardware-accelerated blit operation. Likewise, in step 2, when QuickDraw3D renders to this GWorld and passes the buffer to RAVE, the ATI RAVE driver detects that the buffer actually resides in VRAM, and so will accelerate drawing to it.
Results
Testing on a PowerPC 9600, 64 MB RAM, 225 MHz, 640x480x32, comparing the original "Background" sample app with the modified "Background (ATI)" version, the results are as follows:
Background (Original): 18 fps
Background (ATI): 91 fps
A rather impressive improvement :-)
Limitations
The ATI solution has clear benefits, but also has a couple of limitations. First, GWorld buffers which reside in VRAM need to be created with the same depth as the hardware display on which they are stored. Second, a mode switch can necessitate discarding the contents of VRAM. In this event, a callback is issued to the calling application, which the app needs to handle gracefully. This will probably entail reallocating regular GWorlds and rerendering. The sample app catches these events and simply terminates.
Another limitation is that once a GWorld is modified so that its PixMap memory resides in VRAM, it is not "legal" to call UnlockPixels() on that GWorld's PixMap. LockPixels() is fine, and causes the baseAddr field in the GWorld's PixMap to be converted from a handle to a pointer. UnlockPixels(), on the other hand, changes the pointer back to a handle. Because the memory now resides in VRAM, UnlockPixels() will result in the baseAddr field being set to NULL.
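One way an application might protect itself is to route every unlock through a wrapper that skips VRAM-backed GWorlds. The following is a minimal sketch under stated assumptions: GuardedPixMap, mock_unlock_pixels() and guarded_unlock() are hypothetical names, and the mock only imitates the behavior described above rather than calling the real UnlockPixels().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the PixMap state that matters here. */
typedef struct {
    void *baseAddr;  /* pixel memory pointer */
    int   inVRAM;    /* set by the app when it repoints baseAddr at VRAM */
    int   locked;    /* 1 while the pixels are locked */
} GuardedPixMap;

/* Mock of the behavior described above: unlocking a VRAM-backed
   PixMap loses the baseAddr pointer. */
static void mock_unlock_pixels(GuardedPixMap *pm)
{
    if (pm->inVRAM) pm->baseAddr = NULL;
    pm->locked = 0;
}

/* App-level wrapper: never unlock a VRAM-backed PixMap.
   Returns 1 if the unlock was performed, 0 if it was skipped. */
static int guarded_unlock(GuardedPixMap *pm)
{
    assert(pm != NULL);
    if (pm->inVRAM) return 0;   /* leave baseAddr pointing at VRAM */
    mock_unlock_pixels(pm);
    return 1;
}
```

With a wrapper like this, existing lock/unlock pairs in the render loop can stay in place, and only the wrapper needs to know which GWorlds were moved to VRAM.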
Acknowledgements
Thanks very much to Bruce Parke of ATI for making these optimizations possible.